A friend from my Scrabble club contacted me the other day. He was looking
for a way to find if there was any way of overlapping 5 words of length 7 so
that they can be played one at a time, below the previous word. His program
had been running for 3 hours, and still had not completed. He was wondering
if there were any ways to build some efficiencies into the code.

When I asked how he was building the comparisons, I found out that he was
using nested loops, where if the newest word was placed at the bottom, then
all of the positions were checked vertically to make sure each one was an
acceptable word

My first thought was that to build our words of 2 to 5 letters that we
wanted to check, we do not want to use all of words. We only want to use the
5 letter words that are "buildable" (ie. the first 2 letters, first 3 letters, and
first 4 letters are acceptable words in their own right). By culling the 5 letter
words to this "buildable" category we reduced the number of 5 letter words used from
9,403 to 1,264!

At the same time we only want to check against the 2 letter, 3 letter, and
4 letter words that start one of these "buildable" 5 letter words, since if
an overlapping set of size less than 5 has no future of developing into a 5
word overlap if they are not built from one of these shorter words.

To also speed the lookup process, I created a dictionary of each acceptable
string of the shorter words, and what letters can build another acceptable word.
Instead of trying every word we can use Python's list comprehensions to filter
for all words that "fit" under the current set - checking starting at position 0
and proceeding to the end - stopping if no words meet the conditions given.

(By this time my friend contacted me - the program found an answer at around 3 1/2 hours!)

I put my prototype code together. To make sure it was running with a good amount
of speed I put a "stopwatch" routine into the program. I used one from someone who
kindly provided some code on stackoverflow.com. Watching the code run I came to find
that the code was not trustworthy - it only showed 24 minutes in an hour!

Once I put all the code together it seemed to have found some efficiency:

Program Completed!!!
2 answers found
PATAMAR-AMARONE-CURETTE-ESTATED-DESEEDS
PATAMAR-AMARONE-CURETTE-ESTATED-RESEEDS
0 Hours, 7 Minutes, 2 Seconds

So, by taking the overhead time of making the dictionary at the beginning of the
program (about 1 minute), the overall code was 30 times quicker in finding the solutions!

The Python version of my code is listed below. If anyone is interested in my C#
version of the same code (which I wrote to show my concepts to my friend, who primarily
programs in C# at his job) please feel free to contact me at apengelly@golden.net

---PYTHON CODE LISTED BELOW---

import time

def stopWatch(value):

    '''This subroutine changes the time in seconds to a nice display value'''

    valueD = (((value/24)/60)/60)
    Days = int (valueD)

    valueH = (valueD - Days)*24
    Hours = int(valueH)

    valueM = (valueH - Hours)*60
    Minutes = int(valueM)

    valueS = (valueM - Minutes)*60
    Seconds = int(valueS)

    print("{} Hours, {} Minutes, {} Seconds".format(Hours, Minutes, Seconds))


def buildWord(matrix, dispstr):

    '''This subroutine is recursive. Instead of iterating over all sevens,
     it uses the lookup dictionary created in the main program (five_tree)
     to filter all sevens by the possible letters that could be next to
     form the five letter words. By filtering each possibility one at a
     time it saves time by not having to consider all of the words'''

    # copy the original lists, so that they will not be altered
oldmatrix = matrix.copy()
currsevens = sevens.copy()
    i = 0
    while len(currsevens) > 0 and i < 7:

         # filter the seven letter words by if the letter in the
         # proper position is within the possibilities for building
         # each of the five letter words

         currsevens = [word for word in currsevens if word[i]
             in five_tree[matrix[i]]]
         i += 1
    if len(currsevens) != 0:

         # if there are any words that do fit, add it to the current list
         # if the number of sevens now added is 5 then this is a solution,
         # otherwise recur the subroutine to find if another seven can be
         # added.

         for newword in currsevens:
             for i,char in enumerate(newword):
                 matrix[i] += char
             if len(matrix[0])==5:
                 solution_list.append(dispstr + "-" + newword)
             else:
                 newdisp = dispstr + "-" + newword
                 buildWord(matrix, newdisp)
             matrix = oldmatrix.copy()


''' Main Program '''

# Collect the start time (counts in seconds)

start = time.time()

# Initialize the lists

wordlist = []
fivechar = [[],[],[],[],[]]
fivebuild = [[],[],[],[],[],[]]
solution_list = []

# ... and the dictionary

five_tree = {}

print("Initiating Lists...")

# Load the word list

wordfile = open(r"C:\Users\Apeng\OneDrive\Documents\OWLLIST.txt")
for line in wordfile:
     wordlist.append(line.strip())
wordfile.close()

# Filter the word list to pull out all 5 letter words
# where the first 2 letters, the first 3 letters, and
# the first 4 letters all make valid words.

fives = [word for word in wordlist if (len(word)==5
             and word[0:2] in wordlist
             and word[0:3] in wordlist
             and word[0:4] in wordlist)]
print (len(fives))
currtime = time.time()
stopWatch(currtime-start)

# Make a list of all of the 7 letter words

sevens = [word for word in wordlist if len(word)==7]

# Make lists of all of the opening strings for each length
# from 1 to 5. The use of the word "set" is to get unique
# values. So the first 4 characters of BORES and BORED
# would only be represented once in the list

for i in range(1,5):
     fivebuild[i]=list(set([word[0:i] for word in fives]))
fivebuild[5] = fives

# Build a dictionary where each string points to the letters that can follow
# in the paths to build the 5 letters words. The key will be the lead string
# while the value is a list of the letters that can follow.

for i in range(1,5):
     for node in fivebuild[i]:
         next_path = [word[-1] for word in fivebuild[i+1] if word[0:i]==node]
         five_tree[node] = next_path

# Some of the letters that may be in our first words do not start aby words
# that can be build letter by letter into a 5 letter word (example: 'I').
# Add in the letters to the dictionary, pointing to a value of an empty list.

all_letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
for letter in all_letters:
     if letter not in five_tree.keys():
         five_tree[letter]=[]

for i, word in enumerate(sevens):

     # Split all of the characters in the seven letter word into separate
     # entries in a list, that can then be built off of.

     chars = []
     for char in word:
         chars.append(char)

     # Every 100 sevens display a message indicating where we are in the program
     # and how much time has expired

     if i % 100 == 0:
         print("Considering word {} of {} - {}".format(i+1, len(sevens), word))
         currtime = time.time()
         stopWatch(currtime-start)

     # Call the main recursive function

     buildWord(chars,word)


# Get the end time, and display any solutions found, in addition to how
# much time expired in the running of the program.

end = time.time()
print("Program Completed!!!")
print("{} answers found".format(len(solution_list)))
for solution in solution_list:
     print(solution)
stopWatch(end-start)

If interested, please view my resume